Yeyun Gong

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Feb 02, 2026
Via arXiv

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Feb 02, 2026
Via arXiv

Sigma-MoE-Tiny Technical Report

Dec 19, 2025
Via arXiv

SIGMA: An AI-Empowered Training Stack on Early-Life Hardware

Dec 15, 2025
Via arXiv

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

Oct 09, 2025
Via arXiv

Training Matryoshka Mixture-of-Experts for Elastic Inference-Time Expert Utilization

Sep 30, 2025
Via arXiv

PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning

Jun 17, 2025
Via arXiv

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Jun 10, 2025
Via arXiv

HiCaM: A Hierarchical-Causal Modification Framework for Long-Form Text Modification

May 30, 2025
Via arXiv

How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective

May 27, 2025
Via arXiv